Automated Planning in Repeated Adversarial Games

Authors

  • Enrique Munoz de Cote
  • Archie C. Chapman
  • Adam M. Sykulski
  • Nicholas R. Jennings
Abstract

Game theory’s prescriptive power typically relies on full rationality and/or self-play interactions. In contrast, this work sets aside these fundamental premises and focuses instead on heterogeneous autonomous interactions between two or more agents. Specifically, we introduce a new and concise representation for repeated adversarial (constant-sum) games that highlights the features an automated planning agent needs in order to reason about how to score above the game’s Nash equilibrium when facing heterogeneous adversaries. To this end, we present TeamUP, a model-based RL algorithm designed for learning and planning with such an abstraction. In essence, it is somewhat similar to R-max, with a cleverly engineered reward shaping that treats exploration as an adversarial optimization problem. In practice, it attempts to find an ally with which to tacitly collude (in games with more than two players) and then collaborates on a joint plan of actions that can consistently score a high utility in adversarial repeated games. We use the inaugural Lemonade Stand Game Tournament to demonstrate the effectiveness of our approach, and find that TeamUP is the best performing agent, demoting the Tournament’s actual winning strategy into second place. In our experimental analysis, we show that our strategy successfully and consistently builds collaborations with many different heterogeneous (and sometimes very sophisticated) adversaries.
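
To make the idea of tacit collusion in a constant-sum game concrete, here is a minimal sketch in Python. It does not reproduce TeamUP or the tournament's actual rules: the 12-position ring, the 12 units of customer value per position, the nearest-stand rule with equal tie splitting, and the function names are all assumptions chosen for illustration. Under those assumptions every round pays out 144 units in total, so an equal split would give each of three players 48.

```python
from fractions import Fraction

N_POSITIONS = 12         # assumed ring size (hypothetical)
UNITS_PER_POSITION = 12  # assumed customer value at each position (hypothetical)


def ring_distance(a, b, n=N_POSITIONS):
    """Shortest distance between two positions on a ring of n positions."""
    d = abs(a - b) % n
    return min(d, n - d)


def payoffs(choices, n=N_POSITIONS, units=UNITS_PER_POSITION):
    """Payoff per player: each position's value goes to the nearest stand,
    split equally among players tied for nearest."""
    totals = [Fraction(0)] * len(choices)
    for spot in range(n):
        dists = [ring_distance(spot, c, n) for c in choices]
        best = min(dists)
        winners = [i for i, d in enumerate(dists) if d == best]
        for i in winners:
            totals[i] += Fraction(units, len(winners))
    return totals


# Two allies sit on opposite sides of the ring (positions 0 and 6);
# the third player tries every possible position.
for x in range(N_POSITIONS):
    a, b, c = payoffs((0, 6, x))
    print(x, float(a), float(b), float(c), float(a + b + c))

print([int(p) for p in payoffs((0, 6, 3))])  # -> [54, 54, 36]
```

In this toy game the third player stays below the equal-split value of 48 wherever it moves, while the two opposite players share the remainder. That is the kind of joint plan the abstract alludes to, although finding a willing ally against unknown opponents is the hard part that the sketch leaves out.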

Similar resources

Robust Planning in Domains with Stochastic Outcomes, Adversaries, and Partial Observability

Real-world planning problems often feature multiple sources of uncertainty, including randomness in outcomes, the presence of adversarial agents, and lack of complete knowledge of the world state. This thesis describes algorithms for four related formal models that can address multiple types of uncertainty: Markov decision processes, MDPs with adversarial costs, extensive-form games, and a new c...

Counterexample-guided Planning

Planning in adversarial and uncertain environments can be modeled as the problem of devising strategies in stochastic perfect information games. These games are generalizations of Markov decision processes (MDPs): there are two (adversarial) players, and a source of randomness. The main practical obstacle to computing winning strategies in such games is the size of the state space. In practice ...

Modified Adversarial Hierarchical Task Network Planning in Real-Time Strategy Games

The application of artificial intelligence (AI) to real-time strategy (RTS) games includes considerable challenges due to the very large state spaces and branching factors, limited decision times, and dynamic adversarial environments involved. To address these challenges, hierarchical task network (HTN) planning has been extended to develop a method denoted as adversarial HTN (AHTN), and this m...

OBDD-Based Optimistic and Strong Cyclic Adversarial Planning

Recently, universal planning has become feasible through the use of efficient symbolic methods for plan generation and representation based on reduced ordered binary decision diagrams (OBDDs). In this paper, we address adversarial universal planning for multi-agent domains in which a set of uncontrollable agents may be adversarial to us. We present two new OBDD-based universal planning algorith...

Adversarial Hierarchical-Task Network Planning for Real-Time Adversarial Games

Real-time strategy (RTS) games are hard from an AI point of view because they have enormous state spaces, combinatorial branching factors, allow simultaneous and durative actions, and players have very little time to choose actions. For these reasons, standard game tree search methods such as alpha-beta search or Monte Carlo Tree Search (MCTS) are not sufficient by themselves to handle these gam...

Publication date: 2010